An In-place Framework for Exact and Approximate Shortest Unique Substring Queries
Authors
Abstract
We revisit the exact shortest unique substring (SUS) finding problem and, motivated by its applications in subfields such as computational biology, propose an approximate version in which mismatches are allowed. We design a generic in-place framework that solves both exact and approximate k-mismatch SUS finding, using the minimum 2n memory words plus n bytes of space, where n is the input string size. With this framework, we can find the exact and approximate k-mismatch SUS for every string position in a total of O(n) and O(n^2) time, respectively, regardless of the value of k. Our framework does not involve any compressed or succinct data structures and is thus practical and easy to implement.
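To make the problem statement concrete, the following is a minimal brute-force sketch of the exact and k-mismatch SUS query for a single position. It is not the in-place framework of the paper and comes nowhere near its time or space bounds. The function names (`hamming`, `is_k_unique`, `naive_sus`), the 0-based inclusive indexing, and the leftmost tie-breaking rule are assumptions made for illustration. A substring is treated as k-mismatch unique when no other substring of the same length, starting at a different position, lies within Hamming distance k of it; k = 0 gives the exact case.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def is_k_unique(s: str, i: int, length: int, k: int = 0) -> bool:
    """True if s[i:i+length] has no other same-length substring of s
    (starting at a different position) within Hamming distance k."""
    target = s[i:i + length]
    for j in range(len(s) - length + 1):
        if j != i and hamming(target, s[j:j + length]) <= k:
            return False
    return True


def naive_sus(s: str, p: int, k: int = 0):
    """Brute-force sketch (assumed names, not the paper's algorithm).

    Returns (start, end), 0-based and inclusive, of a shortest k-mismatch
    unique substring of s that covers position p. Ties are broken by
    taking the leftmost candidate, which is an assumption of this sketch.
    """
    n = len(s)
    for length in range(1, n + 1):
        # Start positions whose window of this length covers p.
        lo = max(0, p - length + 1)
        hi = min(p, n - length)
        for i in range(lo, hi + 1):
            if is_k_unique(s, i, length, k):
                return (i, i + length - 1)
    return None  # only reached for an empty string


if __name__ == "__main__":
    text = "banana"
    for pos in range(len(text)):
        print(pos, naive_sus(text, pos), naive_sus(text, pos, k=1))
```

The whole string is always unique, so a non-empty input always yields an answer. Each uniqueness check rescans the string, so this sketch is only suitable for checking small examples by hand.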
Similar resources
Shortest Unique Substring Queries on Run-Length Encoded Strings
We consider the problem of answering shortest unique substring (SUS) queries on run-length encoded strings. For a string S, a unique substring u = S[i..j] is said to be a shortest unique substring (SUS) of S containing an interval [s, t] (i ≤ s ≤ t ≤ j) if for any i′ ≤ s ≤ t ≤ j′ with j − i > j′ − i′, S[i′..j′] occurs at least twice in S. Given a run-length encoding of size m of a string of len...
Shortest Unique Queries on Strings
Let D be a long input string of n characters (from an alphabet of size up to 2^w, where w is the number of bits in a machine word). Given a substring q of D, a shortest unique query returns a shortest unique substring of D that contains q. We present an optimal structure that consumes O(n) space, can be built in O(n) time, and answers a query in O(1) time. We also extend our techniques to solve s...
Processing Queries on Road Networks in Spatial Data Base Perspective for Selectivity Estimation
This work focuses on building a framework capable of analyzing spatial approximate substring queries, mainly to solve the selectivity estimation problem for range queries on road networks represented in spatial databases. Selectivity estimation means estimating the size of the results, i.e., estimating the number of points that are present in a graph whi...
A simple yet time-optimal and linear-space algorithm for shortest unique substring queries
Article history: Received 30 March 2014 Accepted 7 November 2014 Available online 13 November 2014 Communicated by G. Ausiello
Shortest unique palindromic substring queries in optimal time
A palindrome is a string that reads the same forward and backward. A palindromic substring P of a string S is called a shortest unique palindromic substring (SUPS) for an interval [s, t] in S, if P occurs exactly once in S, this occurrence of P contains interval [s, t], and every palindromic substring of S which contains interval [s, t] and is shorter than P occurs at least twice in S. The SUPS...